g02gcf
g02gcf
© Numerical Algorithms Group, 2002.
Purpose
G02GCF Fits a generalized linear model with Poisson errors
Synopsis
[dev,idf,b,irank,se,cov,v,ifail] = g02gcf(x,isx,y<,link,a,wt,mean,offset,v,...
tol,maxit,iprint,epslon,weight,ifail>)
Description
A generalized linear model with Poisson errors consists of the
following elements:
(a) a set of n observations, y , from a Poisson distribution:
i
y -(mu)
(mu) e
-----------
y!
(b) X, a set of p independent variables for each observation,
x ,x ,...,x .
1 2 p
(c) a linear model:
--
(eta)= > (beta) x .
-- j j
(d) a link between the linear predictor, (eta), and the mean of
the distribution, (mu), (eta)=g((mu)). The possible links
functions are:
a
(i) exponent link: (eta)=(mu) , for a constant a,
(ii) identity link: (eta)=(mu),
(iii) log link: (eta)=log(mu),
____
(iv) square root link: (eta)=\/(mu),
1
(v) reciprocal link: (eta)= ----.
(mu)
(e) a measure of fit, the deviance:
n n { ( y ) }
-- ^^^^ -- { ( i ) ^^^^ }
> dev(y ,(mu) )= > 2{y log( -----)-(y -(mu) )}
-- i i -- { i ( ^^^^ ) i i }
i=1 i=1 { ( (mu) ) }
{ ( i) }
The linear parameters are estimated by iterative weighted
least-squares. An adjusted dependent variable, z, is
formed:
d(eta)
z=(eta)+(y-(mu)) ------
d(mu)
and a working weight, w,
( d(eta))2
w=((tau)d ------)
( d(mu) )
____
where (tau)=\/(mu).
At each iteration an approximation to the estimate of
^^^^^^
(beta), (beta), is found by the weighted least-squares
regression of z on X with weights w.
1/2 1/2
G02GCF finds a QR decomposition of w X, i.e., w X=QR where R
is a p by p triangular matrix and Q is an n by p column
orthogonal matrix.
^^^^^^
If R is of full rank, then (beta) is the solution to:
^^^^^^ T 1/2
R(beta)=Q w z
If R is not of full rank a solution is obtained by means of a
singular value decomposition (SVD) of R.
(D 0) T
R=Q (0 0)P ,
*
where D is a k by k diagonal matrix with non-zero diagonal
1/2
elements, k being the rank of R and w X.
This gives the solution
(Q 0)
^^^^^^ -1( * ) T 1/2
(beta)=P D (0 I )Q w z
1
P being the first k columns of P, i.e., P=(P P ).
1 1 0
The iterations are continued until there is only a small change
in the deviance.
The initial values for the algorithm are obtained by taking
^^^^^
(eta)=g(y)
The fit of the model can be assessed by examining and testing the
deviance, in particular by comparing the difference in deviance
between nested models, i.e., when one model is a sub-model of the
other. The difference in deviance between two nested models has,
2
asymptotically, a (chi) distribution with degress of freedom
given by the difference in the degrees of freedom associated with
the two deviances.
^^^^^^
The parameters estimates, (beta), are asymptotically Normally
distributed with variance-covariance matrix:
T
-1 -1
C=R R in the full rank case, otherwise
-2 T
C=P D P
1 1
The residuals and influence statistics can also be examined.
^^^^^ ^^^^^^
The estimated linear predictor (eta)=X(beta), can be written as
1/2
Hw z for an n by n matrix H. The ith diagonal elements of H,
h , give a measure of the influence of the ith values of the
i
independent variables on the fitted regression model. These are
known as leverages.
^^^^ -1 ^^^^^
The fitted values are given by (mu)=g ((eta)).
G02GCF also computes the deviance residuals, r:
_____________
^^^^ / ^^^^
r =sign(y -(mu) ) / dev(y ,(mu) ).
i i i \/ i i
An option allows prior weights to be used with the model.
In many linear regression models the first term is taken as a
mean term or an intercept, i.e., x = 1, for i=1,2,...,n. This
i,1
is provided as an option.
Often only some of the possible independent variables are
included in a model; the facility to select variables to be
included in the model is provided.
If part of the linear predictor can be represented by a variables
with a known coefficient then this can be included in the model
by using an offset, o:
--
(eta)=o+ > (beta) x .
-- j j
If the model is not of full rank the solution given will be only
one of the possible solutions. Only certain linear combimations
of the parameters will have unique estimates, these are known as
estimable functions.
Details of the SVD, are made available, in the form of the matrix
*
P :
( -1 T)
(D P )
( 1)
* ( T )
P =( P ).
( 0 )
Parameters
g02gcf
Required Input Arguments:
x (:,:) real
isx (:) integer
y (:) real
Optional Input Arguments: <Default>
link (1) string 'l'
a real 0.0
wt (:) real zeros(length(y),1)
mean (1) string 'm'
offset (1) string 'n'
v (:,:) real zeros(length(y),ip+7)
tol real 10*eps
maxit integer 10
iprint integer -1
epslon real eps
weight (1) string g02gcf04(wt)
ifail integer -1
Output Arguments:
dev real
idf integer
b (:) real
irank integer
se (:) real
cov (:) real
v (:,:) real
ifail integer